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]IM£QL,INHIBIT,..A^ 

Statement as to Federally Sponsored Research

This invention was made with Government support
under grant number FO1-CA42063, awarded by the National
Institutes of Health. The government may have certain
rights in the invention.

The invention relates to methods of identifying
compounds that disrupt polypeptide aggregation. The
identified compounds can be used to treat disorders
associated with such aggregation, such as Huntington's
disease or Alzheimer's disease.

Background of the Invention

Huntington's disease (HD) is an autosomal dominant,
progressive, neurodegenerative disorder associated with
selective neuronal cell death, occurring primarily in the
cortex and striatum. The disorder is caused by a
CAG codon repeat expansion in the first exon of a gene
encoding a 350 kD protein, huntingtin, with unknown function
(Ambrose et al., Somat. Cell Mol. Genet. 20:27-38, 1994).
CAG encodes the amino acid glutamine ("Gln" or "Q"), so CAG
repeats encode polyglutamine regions within huntingtin. The
polyglutamine region of huntingtin from non-HD individuals
contains about 8-31 consecutive Gln residues. Huntingtin
with over 37 consecutive Gln residues is associated with
mild to severe HD, with the more severe cases exhibiting a
polyglutamine region of up to about SS Gln residues.

In addition to HD, at least six other inherited
neurodegenerative disorders have been found to be associated
with CAG expansions. Increasing the length of CAG repeats
in the coding region of unrelated genes, and the resulting
polyglutamine regions in the encoded proteins, causes a
similar pattern of neuron degeneration, indicating a
similar, if not identical, mechanism of cell death. It has
been proposed that HD may be caused by abnormal protein-
protein interactions mediated by elongated polyglutamines.



Summary of the Invention

The invention is based, in part, on the discovery of
a method for identifying compounds which disrupt the
aggregation of polypeptides. These compounds are
potentially useful as therapeutics for the treatment of
disease conditions associated with such aggregation.

Accordingly, the invention features a method of
identifying a compound which disrupts polypeptide
aggregation. The method includes: providing a first
polypeptide which is labelled with a detection moiety (e.g.,
an enzyme or a fluorescent protein) that is inactive in the
presence of a denaturant, and a second polypeptide (which
can be identical to the first), wherein the first and second
polypeptides aggregate upon contact; contacting the first
polypeptide, the second polypeptide and a test compound to
form a mixture; contacting the mixture with the denaturant;
and determining the activity of the detection moiety. A
decrease in the activity following contact of the mixture
with the denaturant indicates that the test compound has
prevented at least some of the polypeptides from
aggregating, thereby leaving them susceptible to
inactivation by the denaturant. Such an outcome suggests
that the test compound is a polypeptide aggregation
disrupting compound. In the above method, the first or
second polypeptide can be immobilized, or they both can be
in solution. Alternatively, they can be within a cell,
e.g., a cell transfected with a DNA encoding the first
polypeptide and/or the second polypeptide. The first and
second polypeptides can be identical or different, so long
as they aggregate upon contact. The first and second
polypeptides can be polypeptides that contain an extended
polyglutamine region, beta-amyloid polypeptides, tau
proteins, presenilins, alpha-synucleins and prion proteins.
Examples of naturally occurring polypeptides that contain
extended polyglutamine regions are huntingtin, atrophin-1,
ataxin-1, ataxin-2, ataxin-3, ataxin-7, alpha 1a-voltage
dependent calcium channel, and androgen receptor. Non-
naturally occurring polypeptides that contain an extended
polyglutamine region are polypeptides which include at least
32 consecutive glutamine residues. In the above method, the
detection moiety is preferably a fluorescent protein or an
enzyme such as luciferase, and the extended polyglutamine
region is preferably at least 33, 34, 35, 36, 37, 40, 42,
47, 50, 52, 60, 65, 70, 72, 75, 80, 85, 95, 100, 104, 110,
113, 120, 130, 140, 144, 151, 160, 170, 180, 190, 191, 195,
200, 210, 230, 250, 270 or 300 glutamine residues in length.

Alternatively, the method includes: providing a
fluorescently labelled first polypeptide, wherein the first
polypeptide contains an extended polyglutamine region;
providing a second polypeptide containing an extended
polyglutamine region; contacting the first polypeptide, the
second polypeptide and a test compound to form a mixture;
denaturing unaggregated polypeptides in the mixture; and
detecting fluorescence, wherein a decrease in fluorescence
in the presence of the test compound indicates that the test
compound is a polypeptide aggregation disrupting compound.
The first and second polypeptides can be naturally or non-
naturally occurring polypeptides that have at least 32
consecutive glutamine residues. As above, the first or the
second polypeptide can be immobilized or both polypeptides
can be in solution. Alternatively, they can be within a
cell, e.g., a transfected cell which expresses both
polypeptides.

Another method of identifying a compound which
disrupts the aggregation of polypeptides containing extended
polyglutamine regions includes providing a cell which is
genetically modified to express a DNA encoding a
heterologous polypeptide containing an extended
polyglutamine region; contacting the cell with a test
compound; and determining whether the test compound
decreases the amount of aggregation of the polypeptide in
the cell, wherein a decrease in polypeptide aggregation in
the presence of the test compound indicates that the test
compound is a polypeptide aggregation disrupting compound.
The heterologous polypeptide can be, for example, a fusion
protein comprising an antigenic tag or a label. Examples of
labels include fluorescent proteins (e.g., a green
fluorescent protein (GFP) or a blue fluorescent protein
(BFP)) and enzymes. Where the label is a fluorescent
protein or other denaturable protein, the step of
determining whether the compound is an aggregation disrupting
compound includes contacting the cell with a denaturant such
as detergent or heat sufficient to effect denaturing of the
label portion of unaggregated fusion protein, and detecting
fluorescence, wherein a decrease in fluorescence following
contact of the cell with the denaturant, compared to
fluorescence in a similar cell that is treated with the
denaturant but not the test compound, indicates that the
compound is a polyglutamine polypeptide aggregation
disrupting compound. The expression of the DNA can be
inducible, e.g., expression can be induced upon exposure of
the cell to an inducing agent such as ecdysone or
muristerone.

A final method of identifying a compound which
disrupts the aggregation of polypeptides includes the steps
of providing a cell that is genetically modified to express
a DNA encoding a heterologous polypeptide, wherein molecules
of the polypeptide spontaneously aggregate within the cell;
contacting the cell with a test compound; and determining
whether molecules of the polypeptide aggregate in the
presence of the test compound, wherein a decrease in
aggregation of the polypeptide molecules in the presence of
the test compound indicates that the test compound is a
polypeptide aggregation disrupting compound. The
polypeptide can be a fusion protein comprising a label such
as a fluorescent protein (e.g., a GFP or a BFP) or an
enzyme. The method can further include contacting the cell
with a denaturant such as a detergent or heat, and detecting
fluorescence or other activity of the label, wherein a
decrease in fluorescence or activity compared to a control
not exposed to the test compound indicates that the compound
is a polypeptide aggregation disrupting compound.

The invention features a DNA encoding a fusion
protein which includes (a) at least 32 contiguous glutamine
residues and (b) a label (e.g., a fluorescent protein such
as GFP or BFP or an enzyme such as luciferase), wherein the
sequence encoding the at least 32 glutamine residues
comprises both CAG codons and CAA codons. The CAG and CAA
codons can be present as a mixture in the DNA, e.g.,
containing the sequence CAA CAG CAG CAA CAG CAA (SEQ ID
NO:1), e.g., (CAA CAG CAG CAA CAG CAA)n (SEQ ID NO:1), where
n can be between 5-300, e.g., n is 5, 6, 7, 8, 9, 10, 11,
12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26,
27, 28, 29, or 30. The CAG and CAA codons need not be
present in equal numbers. For example, CAA could be every
third or fourth codon in the polyglutamine-encoding DNA.
The CAG and CAA codons can be present in a repeating
pattern, or can be a random mixture.

Other possible labels include other fluorescent
proteins, enzymes, and any other protein which can be used
to distinguish between aggregated and non-aggregated
polyglutamine-containing proteins. The invention further
features a cultured, genetically modified cell which
expresses the above described DNA, and a method of producing
a fusion protein, comprising culturing the genetically
modified cell under conditions appropriate for expressing
the DNA encoding the fusion protein.

Also within the invention is a fusion polypeptide
comprising (a) at least 32 contiguous glutamine residues and
(b) a fluorescent protein such as a GFP or BFP. In
preferred embodiments, the polyglutamine region contains at
least 33 glutamine residues, and more preferably at least
34, 35, 36, 37, 40, 42, 47, 50, 52, 60, 65, 70, 72, 75, 80,
85, 95, 100, 104, 110, 119, 120, 130, 140, 144, 151, 160,
170, 180, 190, 191, 195, 200, 210, 230, 250, 270 or 300.

The invention also features an expression plasmid
which (1) includes a DNA sequence which encodes a fusion
protein of (a) at least 32 contiguous glutamine residues and
(b) a label, wherein the sequence that encodes the at least
32 glutamine residues includes both CAG codons and CAA
codons, and (2) is operably linked to an expression control
sequence.

An expression control sequence "operably linked" to
a coding sequence is placed so that it controls expression
of the latter.

An "isolated DNA" is a DNA which has a non-naturally
occurring sequence, or which has the sequence of part or all
of a naturally occurring gene but is free of the genes that
flank the naturally occurring gene of interest in the genome
of the organism in which the gene of interest naturally
occurs. The term therefore includes a recombinant DNA
incorporated into a vector, into an autonomously replicating
plasmid or virus, or into the genomic DNA of a prokaryote or
eukaryote. It also includes a separate molecule such as a
cDNA, a genomic fragment, a fragment produced by polymerase
chain reaction (PCR), or a restriction fragment. It also
includes a recombinant nucleotide sequence that is part of a
hybrid gene, i.e., a gene encoding a fusion protein.
Specifically excluded from this definition are DNA molecules
as they occur in a random library, such as a cDNA or genomic
DNA library.

A "polypeptide" is any peptide-linked chain of amino
acids, regardless of length or post-translational
modification.

A "heterologous polypeptide" is defined in
reference to a given cell: i.e., it is a polypeptide that is
not normally expressed in more than a trace amount within
the given cell. A polypeptide with a non-naturally
occurring sequence (e.g., the fusion proteins of the
invention) is heterologous to all cell types. Even a
polypeptide with a naturally occurring sequence (e.g., human
huntingtin) would be considered a heterologous polypeptide
if it were expressed in a non-human cell, or in a human cell
in which it is not normally expressed in more than a trace
amount.

An "aggregation-disposed polypeptide" refers to a
polypeptide which aggregates with a second polypeptide when
contacted with the latter. The second polypeptide can have
the same or a different 
An "inducing agent" is an agent that triggers or
increases expression of a coding sequence.

The term "label", as used herein, refers to a
detection moiety whose detection properties are altered
either (i) directly as a consequence of polypeptide
aggregation or (ii) upon exposure to an agent following
polypeptide aggregation.

The term "aggregation" refers to a process whereby
polypeptides stably associate with each other to form a
multimeric, insoluble complex, which does not dissociate
under physiological conditions.

An "extended polyglutamine region" refers to a
region of 32 or more (e.g., at least 33, 34, 35, 36, 37, 40,
42, 47, 50, 52, 60, 65, 70, 72, 75, 80, 85, 95, 100, 104,
110, 119, 120, 130, 140, 144, 151, 160, 170, 180, 190, 191,
195, 200, 210, 230, 250, 270 or 300) consecutive glutamine
residues. Polypeptides that contain such regions aggregate
upon contact, though not necessarily immediately.
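
As an illustration only, the definition of an extended polyglutamine
region can be sketched in Python as a scan for the longest run of
consecutive glutamine ("Q") residues in a one-letter amino acid
sequence. The function names and the example sequences are
hypothetical and are not part of the invention; only the threshold of
32 consecutive residues comes from the definition above.

```python
import re

# Threshold taken from the definition in the text: 32 or more
# consecutive glutamine residues.
EXTENDED_POLYQ_MIN = 32


def longest_glutamine_run(protein: str) -> int:
    """Return the length of the longest run of consecutive 'Q' residues."""
    runs = re.findall(r"Q+", protein.upper())
    return max((len(run) for run in runs), default=0)


def has_extended_polyq(protein: str, minimum: int = EXTENDED_POLYQ_MIN) -> bool:
    """True if the sequence contains at least `minimum` consecutive glutamines."""
    return longest_glutamine_run(protein) >= minimum


if __name__ == "__main__":
    # Hypothetical toy sequences, not real huntingtin sequences.
    normal = "MATLEK" + "Q" * 20 + "PPPP"
    expanded = "MATLEK" + "Q" * 40 + "PPPP"
    print(has_extended_polyq(normal), has_extended_polyq(expanded))
```

Interrupted runs are counted separately in this sketch, which matches
the requirement that the glutamine residues be consecutive.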

A "conservative amino acid substitution" is one in
which the amino acid residue is replaced with another
residue having a chemically similar side chain. Families of
amino acid residues having similar side chains have been
defined in the art. These families include amino acids with
basic side chains (e.g., lysine, arginine, histidine),
acidic side chains (e.g., aspartic acid, glutamic acid),
uncharged polar side chains (e.g., glycine, asparagine,
glutamine, serine, threonine, tyrosine, cysteine), nonpolar
side chains (e.g., alanine, valine, leucine, isoleucine,
proline, phenylalanine, methionine, tryptophan), beta-
branched side chains (e.g., threonine, valine, isoleucine)
and aromatic side chains (e.g., tyrosine, phenylalanine,
tryptophan, histidine).
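
A minimal sketch of how the side-chain families listed above could be
encoded, for illustration only. The family membership is copied from
the list above using one-letter amino acid codes; the rule that two
residues sharing any one family count as a conservative pair is an
assumption of this sketch, as are the names used.

```python
# Side-chain families as listed in the text, in one-letter codes.
SIDE_CHAIN_FAMILIES = {
    "basic": set("KRH"),
    "acidic": set("DE"),
    "uncharged polar": set("GNQSTYC"),
    "nonpolar": set("AVLIPFMW"),
    "beta-branched": set("TVI"),
    "aromatic": set("YFWH"),
}


def is_conservative(original: str, replacement: str) -> bool:
    """True if the two residues differ and share at least one family.

    Assumption of this sketch: membership in any single shared family
    is sufficient for the substitution to count as conservative.
    """
    original, replacement = original.upper(), replacement.upper()
    if original == replacement:
        return False  # identical residues are not a substitution
    return any(
        original in family and replacement in family
        for family in SIDE_CHAIN_FAMILIES.values()
    )


if __name__ == "__main__":
    print(is_conservative("K", "R"))  # lysine -> arginine, both basic
    print(is_conservative("K", "D"))  # lysine -> aspartic acid, no shared family
```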

"Percent sequence identity" of two amino acid
sequences or of two nucleic acids is determined using the
algorithm of Karlin and Altschul (Proc. Natl. Acad. Sci. USA
87:2264-2268, 1990), modified as in Karlin and Altschul
(Proc. Natl. Acad. Sci. USA 90:5873-5877, 1993). Such an
algorithm is incorporated into the NBLAST and XBLAST
programs of Altschul et al. (J. Mol. Biol. 215:403-410,
1990). BLAST nucleotide searches are performed with the
NBLAST program, score = 100, wordlength = 12 to obtain
nucleotide sequences homologous to a nucleic acid molecule
of the invention. BLAST protein searches are performed with
the XBLAST program, score = 50, wordlength = 3 to obtain
amino acid sequences homologous to an aggregation-disposed
polypeptide. To obtain gapped alignments for comparison
purposes, Gapped BLAST is utilized as described in Altschul
et al. (Nucleic Acids Res. 25:3389-3402, 1997). When
utilizing BLAST and Gapped BLAST programs, the default
parameters of the respective programs (e.g., XBLAST and
NBLAST) are used. See http://www.ncbi.nlm.nih.gov.

Unless otherwise defined, all technical and
scientific terms used herein have the same meaning as
commonly understood by one of ordinary skill in the art to
which this invention belongs. In case of conflict, the
present application, including definitions, will control.
All publications, patent applications, patents, and other
references mentioned herein are incorporated by reference.
The materials, methods, and examples are illustrative only
and not intended to be limiting. Other features and
advantages of the invention will be apparent from the
detailed description, and from the claims.
The invention is based, in part, on the discovery of
a method which can be used to identify compounds which
inhibit the aggregation of polypeptides. The aggregation of
certain naturally occurring polypeptides is often associated
with pathological disorders such as Alzheimer's disease,
Parkinson's disease and Huntington's disease. A compound
which inhibits aggregation of naturally occurring
aggregation-disposed polypeptides can be used to treat a
subject at risk for such a disorder.

Polypeptides

The invention includes screening methods which are
used to identify compounds which can disrupt the aggregation
of aggregation-disposed polypeptides. An aggregation-
disposed polypeptide can be a naturally occurring
polypeptide or a non-naturally occurring polypeptide.

The aggregation of naturally occurring polypeptides
is often associated with pathological disorders. Examples
of naturally occurring polypeptides which aggregate include
polypeptides which contain extended polyglutamine regions,
herein defined to mean at least 32 contiguous glutamine
residues. Such polypeptides and their associated disorders
are as follows: huntingtin, which is associated with
Huntington's disease; atrophin-1, which is associated with
dentatorubralpallidoluysian atrophy; ataxin-1, which is
associated with spinocerebellar ataxia type-1; ataxin-2,
which is associated with spinocerebellar ataxia type-2;
ataxin-3, which is associated with spinocerebellar ataxia
type-3; alpha 1a-voltage dependent calcium channel, which is
associated with spinocerebellar ataxia type-6; ataxin-7,
which is associated with spinocerebellar ataxia type-7; and
androgen receptor, which is associated with spinobulbar
muscular atrophy. Other naturally occurring polypeptides
known for their ability to aggregate include the synuclein
proteins, namely alpha, beta and gamma synucleins.
Synucleins have been implicated in Alzheimer's disease,
Parkinson's disease and breast cancer. Proteins such as
amyloid light chains and amyloid-associated proteins, which
are associated with amyloidosis, can also be used in the
methods of the invention. Other aggregation-disposed
polypeptides include: mutant transthyretin, which is
associated with familial amyloid polyneuropathies; beta2
microglobulin, aggregation of which causes complications
during chronic renal dialysis; beta amyloid protein, which
is associated with Alzheimer's disease; immunoglobulin light
chain, which is associated with multiple myelomas and
various other B-cell proliferations; and prion proteins,
which cause spongiform encephalopathies like Creutzfeldt-
Jakob disease and kuru in humans.

Non-naturally occurring, aggregation-disposed
polypeptides include variants of naturally occurring
polypeptides, as well as polypeptides which do not occur in
nature but have the ability to aggregate, particularly where
such polypeptides can be used to model naturally occurring,
disease-associated proteins such as huntingtin and beta
amyloid protein. These include polypeptides which are
engineered to include regions, such as an extended
polyglutamine region, which are known to promote polypeptide
aggregation.

Naturally occurring polypeptides of the invention
can be obtained by isolating and purifying the protein from
a natural source. Alternatively, both naturally and non-
naturally occurring aggregation-disposed polypeptides can be
produced recombinantly or chemically synthesized by
conventional methods. An aggregation-disposed polypeptide,
full-length or truncated, can also be part of a fusion
protein, e.g., the protein can be fused to an antigenic tag
such as c-myc or a proteinaceous label such as a green
fluorescent protein (GFP).
Techniques for generating polypeptides are well
known in the art. A typical method involves transfecting
host cells (e.g., bacterial cells, insect cells, mammalian
cells, or plant cells) with an expression vector carrying a
nucleic acid that encodes a polypeptide of interest. The
cell in which the recombinant polypeptide is produced can be
used directly in the methods of the invention, or the
recombinant polypeptide can be purified from the culture
medium or from a lysate of the cells.

Variants of the aggregation-disposed polypeptides
can also be used in the methods of the invention and can be
prepared by substituting selected amino acids in these
polypeptides. A variant of an aggregation-disposed
polypeptide includes a polypeptide which has high sequence
identity (e.g., 60%, 70%, 80%, 90%, 95%, 96%, 97%, 98% or 99%)
to an aggregation-disposed polypeptide described above and
retains the ability to aggregate.

Also useful for the methods of the invention are
aggregation-competent portions of the naturally occurring
aggregation-disposed polypeptides, e.g., a fragment of a
naturally occurring polypeptide containing an extended
polyglutamine region or other region that promotes
aggregation of the parent protein with copies of itself or
with a different protein.

Also included in the invention are aggregation-
disposed fusion proteins, e.g., a fusion protein containing
an extended polyglutamine region and a green fluorescent
protein (GFP) (which term includes enhanced GFP, or "EGFP").
Nucleic Acid Molecules

Isolated nucleic acid molecules that encode
naturally occurring, aggregation-disposed polypeptides,
variants thereof, or non-naturally occurring aggregation-
disposed polypeptides are useful in the methods of the
invention. Naturally occurring nucleic acid sequences which
encode aggregation-disposed polypeptides are well known in
the art, e.g., sequences which encode huntingtin (Genbank
accession # NMD 02 11), atrophin-1
(Genbank accession # AFG3S5S4), ataxin-1
(Genbank accession # ALQQ931), ataxin-2
(Genbank accession # AFG34373), ataxin-3
(Genbank accession # NM004933), alpha 1a-voltage dependent
calcium channel (Genbank accession # AI660731), ataxin-7
(Genbank accession # A16S0731), androgen receptor
(Genbank accession # AI759506), alpha, beta and gamma
synucleins (Genbank accession ## 2MG0308S, AXS7S1S7, and
NM003087, respectively), amyloid light chain
(Genbank accession # AF026929) and amyloid-associated
protein (Genbank accession # AF0533S6). Nucleic acid
sequences that encode fragments of naturally occurring,
aggregation-disposed polypeptides which retain the ability
to aggregate are also useful in the methods of the
invention.

In some instances, it may be preferable to generate
a non-naturally occurring polypeptide which encompasses a
region (e.g., an extended stretch of contiguous glutamine
residues) which is known to be involved in polypeptide
aggregation. For example, recombinant production in
bacteria of a polypeptide with an extended
polyglutamine region is difficult, possibly because DNA or
RNA containing multiple contiguous CAG codons may form
secondary structures which affect replication and/or
transcription. Whatever the mechanism causing this
difficulty, one can overcome it by using a nucleic acid
sequence with a mixture of CAG and CAA codons to encode the
polyglutamine region, e.g., alternating CAG and CAA codons
or another pattern such as (CAA CAG CAG CAA CAG CAA)n (SEQ
ID NO:1), where n can be between 5-300, e.g., n is 5, 6, 7,
8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22,
23, 24, 25, 26, 27, 28, 29, 30. The CAG and CAA codons need
not be present in equal numbers and need not form a
repeating pattern.
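
The mixed-codon design described above can be sketched as follows.
This is an illustrative Python sketch under stated assumptions, not
the procedure used to build any construct: it concatenates the repeat
unit CAA CAG CAG CAA CAG CAA n times, confirms that every codon
encodes glutamine, and reports the longest uninterrupted run of CAG
codons. The function names are hypothetical.

```python
# Repeat unit from the text: (CAA CAG CAG CAA CAG CAA)n.
REPEAT_UNIT = ("CAA", "CAG", "CAG", "CAA", "CAG", "CAA")
GLUTAMINE_CODONS = {"CAA", "CAG"}  # both codons encode glutamine


def polyq_coding_sequence(n: int) -> str:
    """Return the DNA string for n copies of the repeat unit."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return "".join(REPEAT_UNIT) * n


def split_codons(dna: str) -> list:
    """Split a DNA string into codons, requiring a whole number of them."""
    if len(dna) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    return [dna[i:i + 3] for i in range(0, len(dna), 3)]


def translate_glutamine_only(dna: str) -> str:
    """Translate a sequence made only of CAA/CAG codons into a poly-Q string."""
    codons = split_codons(dna)
    if any(codon not in GLUTAMINE_CODONS for codon in codons):
        raise ValueError("sequence contains a non-glutamine codon")
    return "Q" * len(codons)


def longest_cag_run(dna: str) -> int:
    """Length, in codons, of the longest uninterrupted run of CAG."""
    longest = current = 0
    for codon in split_codons(dna):
        current = current + 1 if codon == "CAG" else 0
        longest = max(longest, current)
    return longest


if __name__ == "__main__":
    seq = polyq_coding_sequence(6)  # 6 units x 6 codons = 36 glutamines
    print(len(translate_glutamine_only(seq)), longest_cag_run(seq))
```

With this repeat unit, no more than two CAG codons are ever adjacent,
however large n is, while the encoded protein is still an
uninterrupted glutamine tract.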

Expression Control Sequences and Vectors

Two ways in which the methods of the invention can
be carried out are: (i) using a cell which has been
genetically modified to express aggregation-disposed
polypeptides; and (ii) using purified aggregation-disposed
polypeptides.

Typically, expressing an aggregation-disposed
polypeptide in a cell involves inserting an aggregation-
disposed polypeptide coding sequence into a vector, where it
is operably linked to one or more expression control
sequences. The need for and identity of expression control
sequences will vary according to the type of cell in which
the aggregation-disposed polypeptide sequence is to be
expressed. Examples of expression control sequences include
transcriptional promoters, enhancers, suitable mRNA
ribosomal binding sites, and sequences that terminate
transcription and translation.

Suitable expression control sequences can be
selected by one of ordinary skill in the art. Standard
methods can be used by the skilled person to construct
expression vectors. See, generally, Sambrook et al., 1989,
Molecular Cloning - A Laboratory Manual (2nd Edition), Cold
Spring Harbor Press.

Vectors useful in this invention include plasmid
vectors and viral vectors. Viral vectors can be, for
example, those derived from retroviruses, adenovirus, adeno-
associated virus, SV40 virus, pox viruses, or herpes
viruses. Once introduced into a host cell (e.g., bacterial
cell, yeast cell, insect cell, avian cell, or mammalian
cell), the vector can remain episomal, or be incorporated
into the genome of the host cell. Useful vectors include
vectors which can be purchased commercially, e.g., pcDNA
3.1-based vectors can be purchased from Invitrogen,
Carlsbad, CA. pcDNA 3.1-based vectors include the human
cytomegalovirus (CMV) immediate-early promoter/enhancer for
high level expression in mammalian cell lines, and bovine
growth hormone (BGH) polyadenylation signal for efficient
transcript stabilization and termination.

To generate a purified preparation of the
aggregation-disposed polypeptide for use in the present
method, the aggregation-disposed polypeptide can be produced
recombinantly in a cell (as described above) and then
purified from that cell, or the polypeptide can be made
synthetically. Since aggregation-disposed polypeptides
aggregate, it may be necessary to take certain steps when
producing and isolating the polypeptide so that a soluble,
non-aggregated form of the polypeptide can be obtained. For
example, it is preferable when producing the polypeptide
recombinantly in a cell not to over-produce the polypeptide,
as over-production of aggregation-disposed polypeptides in a
cell may result in polypeptide aggregation.
Labeling Polypeptides

The aggregation-disposed polypeptides can be
chemically coupled to a label or recombinantly expressed as
a fusion protein with a label. Examples of labels include
various enzymes, fluorescent materials, luminescent
materials, and bioluminescent materials. Examples of
suitable enzymes include horseradish peroxidase, alkaline
phosphatase, β-galactosidase, and acetylcholinesterase;
examples of suitable fluorescent materials include
umbelliferone, fluorescein, fluorescein isothiocyanate,
rhodamine, dichlorotriazinylamine fluorescein, dansyl
chloride and phycoerythrin; an example of a luminescent
material is luminol; and examples of bioluminescent
materials include luciferase, luciferin, and aequorin.

The coupling of a label to a polypeptide of the
invention can be carried out by chemical methods known in
the art. A variety of coupling agents, including cross-
linking agents, can be used for covalent conjugation.
Examples of cross-linking agents include N,N'-
dicyclohexylcarbodiimide (DCC; Pierce), N-succinimidyl-S-
acetyl-thioacetate (SATA), N-succinimidyl-3-
(2-pyridyldithio)propionate (SPDP),
ortho-phenylenedimaleimide (o-PDM), and sulfosuccinimidyl
4-(N-maleimidomethyl)cyclohexane-1-carboxylate (sulfo-
SMCC). See, e.g., Karpovsky et al., J. Exp. Med. 160:1686,
1984; and Liu et al., Proc. Natl. Acad. Sci. USA 82:8648,
1985. Other methods include those described by Paulus,
Behring Ins. Mitt., No. 78, 118-132, 1985; Brennan et al.,
Science 229:81-83, 1985; and Glennie et al., J. Immunol.
139:2367-2375, 1987. A large number of coupling agents for
polypeptides, along with buffers, solvents, and methods of
use, are described in the Pierce Chemical Co. catalog,
pages T-155 - T-200, 1994 (3747 N. Meridian Rd., Rockford,
IL 61105, U.S.A.; Pierce Europe B.V., P.O. Box 1512, 3260
EA Oud Beijerland, The Netherlands), which catalog is hereby
incorporated by reference.
Fluorescence labeling can be achieved by purifying
an aggregation-disposed polypeptide and covalently
conjugating the polypeptide to a reactive derivative of an
organic fluorophore. Examples of suitable fluorophores
include fluorescein, rhodamine, Texas Red, and the like.
Fluorescence may be detected by any method known in the art,
e.g., using fluorescence microscopy, a fluorometer, or
fluorescence-activated cell sorting (FACS).

Where the label is a protein, e.g., an enzyme or a
fluorescent protein, it can be made an integral part of the
aggregation-disposed polypeptide by expressing the two
together as a recombinant fusion protein, as discussed
above. Suitable fluorescent proteins include green
fluorescent protein (GFP) and blue fluorescent protein
(BFP).

The GFP gene was originally cloned from the
jellyfish Aequorea victoria. It encodes a protein of
238 amino acids which absorbs blue light (major peak at
395 nm) and emits green light (major peak at 509 nm)
(Prasher et al., Gene 111:229-233, 1992). GFP genes and
functional proteins have been identified in a variety of
organisms in the phyla hydrozoa, cnidaria, anthozoa and
ctenophora.

Both wild-type GFP and mutated GFP from Aequorea
victoria can be used as a label. The mutation of GFP (e.g.,
the substitution of certain amino acids in the GFP
polypeptide) has been reported to yield GFP proteins with
improved spectral properties. For example, mutating serine
65 to a threonine generates a GFP variant which has about
sixfold greater brightness than wild-type GFP (Heim et al.,
Nature 373:663-664, 1995). The coding sequence for an
enhanced GFP can be purchased commercially (Clontech, Palo
Alto, CA).

BFP can also be used as a label. To obtain BFP,
tyrosine 66 of GFP is mutated to a histidine. This mutated
GFP protein fluoresces bright blue, in contrast to the green
of the wild-type protein.



This invention encompasses methods for identifying
compounds that disrupt the aggregation of particular
polypeptides, e.g., the aggregation of huntingtin
polypeptides. Candidate compounds that can be screened in
accordance with the invention include polypeptides, peptide
mimetics, antibodies, and monomeric organic compounds, i.e.,
"small molecules." In particular, certain classes of
compounds may be chosen by one skilled in the art based on
knowledge of the mechanism of aggregation of particular
aggregation-disposed polypeptides. For example, aggregation
of huntingtin polypeptides is believed to be mediated by
hydrogen bond formation. Based on this, compounds such as
D-amino acid-containing peptides and compounds that compete
for H bond formation can be tested by the method of the
invention to determine if these compounds function as useful
aggregation disrupting compounds.



Labels 

To determine if a compound disrupts polypeptide 
aggregation, a method of detecting the extent to which 
polypeptides aggregate in the presence of the compound is 
required. This is accomplished in the methods of the 
invention by labeling the aggregation-disposed polypeptide
with a detection moiety, a detectable property of which
changes (i.e., is lost, gained, or changed in character)
directly or indirectly, based on whether the polypeptide is
in an aggregated state or not. For example, in one
embodiment, the property of the detection moiety is
eliminated, or at least decreased, as a consequence of
aggregation, so that the label on the unaggregated
polypeptide exhibits the property while the label on the
aggregated polypeptide does not. An example of this would
be an enzymatic label which is active only when in an
unaggregated state. Alternatively, the inverse can be true.
For example, when a denaturant-sensitive label is used,
exposure to a denaturant abolishes the detectable property
of the label, if the label is linked to a non-aggregated
polypeptide. Once the polypeptides aggregate, the detection
moiety is protected from the denaturant and retains its
detectable property even after treatment with the
denaturant. The label can be any denaturant-sensitive
detection moiety, e.g., enzymatic or fluorescent.

Where the label is an enzyme, a property of the
enzyme, e.g., the ability to catalyse a particular reaction,
may alter as a consequence of aggregation. For example,
upon aggregation, the enzymatic activity of the label may be
eliminated. Thus, the extent of polypeptide aggregation in
the presence of the test compound is determined by
determining the ability of the enzyme to catalyse the
reaction. An increase in the amount of enzyme activity in
the presence of the test compound, as compared to a control,
indicates that the compound is an aggregation disrupting
compound.


Alternatively, the enzyme may be one which is
inactivated in the presence of a denaturant. Since
aggregation protects the enzyme from denaturing, addition of
a denaturant permits one to determine whether the
polypeptide linked to the enzyme is aggregated or not. If
the amount of enzyme activity is lower in the presence of
denaturant and test compound, compared to denaturant alone,
the test compound is a putative aggregation disrupting
compound.

Where the label is a fluorescent protein, e.g., a
GFP, the ability of the fluorescent-labeled polypeptide to
fluoresce in the presence of a denaturant following
aggregation is determined. For example, when GFP is used as
the label, the addition of a denaturant causes the soluble,
non-aggregated, GFP-labeled polypeptides to denature,
thereby quenching fluorescence. In contrast, aggregated
GFP-labeled proteins are sufficiently protected from the
denaturant so that the GFP of the aggregated GFP-labeled
polypeptides will continue to fluoresce in the presence of
the denaturant. Thus, a decrease in fluorescence of the
GFP-labeled polypeptide in the presence of denaturant plus
test compound, as compared to a control with denaturant
alone, indicates that the test compound is an aggregation
disrupting compound.
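The fluorescence comparison described above can be sketched in code. The following Python fragment is only an illustration under stated assumptions: the function name, the background-subtraction step, and the 50% cutoff are hypothetical choices made for the example. The text itself requires only a decrease in fluorescence relative to the control with denaturant alone.

```python
def is_putative_disruptor(f_test, f_control, f_background=0.0, min_decrease=0.5):
    """Return True when denaturant-resistant fluorescence drops enough.

    f_test       -- fluorescence with denaturant plus test compound
    f_control    -- fluorescence with denaturant alone
    f_background -- reading from a blank sample (assumed, for subtraction)
    min_decrease -- fractional decrease needed to call a hit (hypothetical)
    """
    control_signal = f_control - f_background
    if control_signal <= 0:
        raise ValueError("control fluorescence must exceed background")
    # Clamp at zero so noise below background does not inflate the decrease.
    test_signal = max(f_test - f_background, 0.0)
    decrease = 1.0 - test_signal / control_signal
    return decrease >= min_decrease
```

A sample that loses most of its denaturant-resistant signal relative to the control is flagged, while one that retains nearly all of it is not. In practice the cutoff would be chosen from the variability of replicate controls.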

The label can also be a selectable marker, such as
an antibiotic resistance marker. In this instance, the
aggregation-disposed polypeptide is expressed in a cell as a
fusion with a selectable marker, e.g., neomycin
phosphotransferase (neo). The aggregation of polypeptides
fused to selectable markers in a cell inhibits the ability
of the selectable marker to confer antibiotic resistance on
that cell during selection. Thus, where a selectable marker
is used, the ability of the polypeptides to aggregate in the
presence of a compound is determined by measuring cell
viability in the presence of a selection agent, e.g., an
antibiotic. For example, where the selectable marker is neo,
the selection agent aminoglycoside G-418 can be used, or
where the selectable marker is hygro, the selection agent is
typically hygromycin. An increase in cell viability in the
presence of a test compound plus selection agent, compared
to the selection agent alone, is an indication that the test
compound is an aggregation disrupting compound.

Identification of a compound that disrupts the aggregation
of aggregation-disposed polypeptides

In one screening method of the invention, labeled
(e.g., GFP-labeled) aggregation-disposed polypeptides are
incubated with a test compound of interest. Following a
period of time sufficient to permit polypeptide aggregation,
the polypeptide/test compound mixture is contacted with a
denaturant. The denaturant can be any agent (e.g., heat,
urea, guanidine HCl, a detergent such as Triton or sodium
dodecyl sulfate, or a mixture thereof) that is able to
quench fluorescence of non-aggregated GFP-labeled
polypeptides, but which is unable significantly to quench
fluorescence of aggregated GFP-labeled polypeptides. The
extent of aggregation in the presence of the test compound
is determined by measuring fluorescence. Fluorescence can
be measured by any method known in the art, e.g., using a
fluorometer. A decrease in the amount of
denaturant-resistant fluorescence in the presence of the
test compound, as compared to a control, is an indication
that the test compound is an aggregation disrupting compound.

The above method requires that at least two
aggregation-disposed polypeptides are contacted. In this
method, both polypeptides can be in solution (e.g., in a
cell or in vitro), or alternatively one of the
aggregation-disposed polypeptides can be immobilized, e.g., a
GST-GFP-labeled aggregation-disposed polypeptide can be
immobilized on a polymeric bead or a plastic dish coated with
glutathione.

A variation of the above method involves expressing
aggregation-disposed polypeptides in a cell in the presence
of a test compound of interest. The cell is transfected
with an expression vector containing a nucleotide sequence
that encodes a labeled aggregation-disposed polypeptide, and
contacted with the test compound. Following a suitable
incubation period that permits expression and aggregation of
polypeptides within the cell, the cell is contacted with a
denaturant, e.g., a detergent, and the function of the label
is measured. Where the label is a GFP or BFP, the amount of
fluorescence in the cell is measured and compared to a
control cell that was not exposed to the test compound. A
decrease in fluorescence in a cell exposed to the test
compound, as compared to a control cell, is an indication
that the test compound disrupts polypeptide aggregation.

The above method can be performed in any cell, such
as an immortalized cell, a primary cell, or a secondary
cell. Examples of immortalized cells include COS, Chinese
hamster ovary (CHO), HeLa, Vero, WI38, HepG2, 3T3, RIN,
MDCK, A549, PC12, K562 and 293 cells. Neuronal cells can be
used as well.

Typically, the aggregation-disposed polypeptides are
expressed in a cell using an expression vector. A person
skilled in the art would be able to choose an appropriate
expression vector. For example, expression vectors for use
in mammalian cells ordinarily include an origin of
replication and a promoter located in front of the gene to
be expressed. A polyadenylation site and transcriptional
terminator sequence are preferably included. Ribosome
binding sites and RNA splice sites may also be included. An
example is the SV40 late gene 16S/19S splice donor/acceptor
signal. The promoter in the expression vector can be a
constitutive promoter or an inducible promoter. Preferably,
expression of the aggregation-disposed polypeptide is under
the control of an inducible promoter. Commercially
available inducible expression systems can be used, e.g.,
the ecdysone-inducible expression system (Invitrogen,
Carlsbad, CA; see example section).

A vector which expresses the aggregation-disposed
polypeptides may be introduced into cells by a variety of
physical or chemical methods, including electroporation,
microinjection, microprojectile bombardment, calcium
phosphate precipitation, and liposome-, polybrene-, or DEAE
dextran-mediated transfection. Alternatively, infectious
vectors such as retroviral, herpes, and adenovirus-associated
vectors can be used to introduce the DNA.

Administration of the polypeptide aggregation disrupting compound

Once a given compound is found to have
aggregation-disrupting activity in one of the above screening
methods, it can be tested for safety and efficacy in an animal
model (if there is one) for the disease(s) associated with the
aggregation-disposed polypeptide, or in a human susceptible
to the disease.

Administration of the compound to a subject (e.g., a
human) may be by any known technique. The compound can be
administered to a subject by oral ingestion, intravenous
injection, intramuscular injection, intrathecal injection,
or bronchi-nasal spraying. The invention also pertains to a
pharmaceutical composition of the aggregation disrupting
compound. The composition includes the compound in a
therapeutically effective amount sufficient to inhibit
(i.e., decrease) the aggregation of the target polypeptides
in cells of the patient, and a pharmaceutically acceptable
carrier. A "therapeutically effective amount" refers to an
amount effective, at dosages and for periods of time
necessary, to achieve the desired result. A therapeutically
effective amount of the compound may vary according to
factors such as the disease state, age, sex, and weight of
the individual. Dosage regimens may be adjusted to provide
the optimum therapeutic response. A therapeutically
effective amount is also one in which any toxic or
detrimental effects of the compound are outweighed by the
therapeutically beneficial effects.

One factor that may be considered when determining a
therapeutically effective amount of a compound is the
concentration of the target polypeptide in a biological
compartment of a subject, such as in the cerebrospinal fluid
(CSF) or brain of the subject. For example, the
concentration of natural beta-amyloid protein in the CSF has
been estimated at 3 nM (Schwartzman, Proc. Natl. Acad. Sci.
USA 91:8368-8372, 1994). A non-limiting range for a
therapeutically effective amount of a beta-amyloid
aggregation disrupting compound in the CNS is 0.01 nM-10 µM.
It is to be noted that dosage values may vary with the
severity of the condition to be alleviated.

As used herein, "pharmaceutically acceptable
carrier" includes any and all solvents, dispersion media,
coatings, antibacterial and antifungal agents, isotonic and
absorption delaying agents, and the like that are
physiologically compatible. Preferably, the carrier is
suitable for parenteral administration. More preferably,
the carrier is suitable for administration into the central
nervous system (e.g., intraspinally or intracerebrally).
Pharmaceutically acceptable carriers include sterile
powders, aqueous solutions and dispersions, for example,
water, ethanol, polyol (for example, glycerol, propylene
glycol, liquid polyethylene glycol, and the like), and
suitable mixtures thereof.

Example 1: Generating polypeptides with extended
polyglutamine regions

In order to circumvent difficulties associated with
the propagation of long CAG repeats in bacteria, a cloning
strategy was developed which used a mixture of CAG/CAA
codons which encode 25 glutamine residues (normal in
Huntington disease (HD)), 104 glutamine residues (known to
be pathological in HD), and 191, 230, and 300 glutamine
residues (all of which are longer than the longest
polyglutamine regions observed in HD).

In order to synthesize a sequence which has a
mixture of CAG/CAA codons, CAA CAG CAG CAA CAG CAA (SEQ ID
NO:1) and complementary TTG TTG CTG TTG CTG CTG (SEQ ID
NO:2) oligonucleotides were annealed to generate double
stranded duplex DNA with trinucleotide extensions. These
short duplex DNA molecules were used as starting material
for two consecutive ligations to obtain sequences that
contain a mixture of CAG/CAA codons (CAA CAG CAG CAA CAG
CAA) (SEQ ID NO:1) of different lengths. The ligation
reaction was terminated by addition of dsDNA linkers that
included 5' trinucleotide extensions and the restriction
sites HindIII at 5' and PstI at 3' with respect to the
CAG/CAA DNA strand. Sequences containing the mixture of
CAG/CAA codons were subcloned into Bluescript-KS vector and
maintained in XL-1 Blue (Stratagene, La Jolla, CA). The CAA
CAG CAG CAA CAG CAA (SEQ ID NO:1) consensus was verified by
two-strand sequence analyses. Proteins encoded by the DNA
were found to be stable in bacteria.

To generate mammalian expression constructs, a short
huntingtin N-terminus cDNA fragment, including Kozak box,
start codon and first sixteen amino acids, was amplified by
PCR (Amplitaq, Perkin Elmer), ligated with various
polyglutamine repeats (25Q, 104Q, 191Q, 230Q, 250Q, and
300Q), and subcloned into pcDNA 3.1 (Invitrogen, Carlsbad,
CA).
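The mixed CAG/CAA codon design used in this example can be illustrated with a short Python sketch. It takes the two oligonucleotide sequences as they are written in the text; every function name is hypothetical and introduced only for illustration. The sketch builds a coding sequence for a chosen number of glutamines by cycling through the six-codon unit, checks that every codon encodes glutamine, and checks that the two oligonucleotides pair with an offset that leaves a trinucleotide extension.

```python
# Oligonucleotide sequences as written in the text (spaces removed).
TOP_OLIGO = "CAACAGCAGCAACAGCAA"     # SEQ ID NO:1
BOTTOM_OLIGO = "TTGTTGCTGTTGCTGCTG"  # SEQ ID NO:2
GLN_CODONS = {"CAA", "CAG"}          # both codons encode glutamine


def codons(dna):
    """Split an in-frame DNA string into a list of codons."""
    if len(dna) % 3:
        raise ValueError("sequence length must be a multiple of 3")
    return [dna[i:i + 3] for i in range(0, len(dna), 3)]


def polyq_coding_sequence(n_gln):
    """Build n_gln glutamine codons by cycling through the six-codon unit."""
    unit = codons(TOP_OLIGO)
    return "".join(unit[i % len(unit)] for i in range(n_gln))


def encodes_only_gln(dna):
    """True when every codon in the sequence is CAA or CAG."""
    return all(c in GLN_CODONS for c in codons(dna))


def longest_cag_run(dna):
    """Length, in codons, of the longest uninterrupted CAG run."""
    best = run = 0
    for c in codons(dna):
        run = run + 1 if c == "CAG" else 0
        best = max(best, run)
    return best


def reverse_complement(dna):
    """Reverse complement of an uppercase A/C/G/T string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def overhang_length(top, bottom):
    """Smallest 5' offset of `top` at which the strands pair, else None."""
    rc = reverse_complement(bottom)
    for n in range(len(top)):
        if top[n:] == rc[:len(top) - n]:
            return n
    return None
```

For the two oligonucleotides above, `overhang_length(TOP_OLIGO, BOTTOM_OLIGO)` reports a three-nucleotide offset, consistent with the trinucleotide extensions mentioned in the text. Because the unit alternates CAA with CAG, a sequence built this way never contains more than two consecutive CAG codons, which is the property relied on to avoid long pure CAG repeats.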

To monitor the formation of aggregates in cells,
polyglutamines were fused at the carboxy (C)-terminus with
either a 28 amino acid c-myc tag, or with a 230 amino acid
enhanced green fluorescent protein tag (EGFP; Clontech, Palo
Alto, CA). The first methionine of the EGFP sequence was
replaced by lysine in the polyglutamine (polyQ)/EGFP
constructs.

Example 2: Extended polyQ polypeptides form cytoplasmic and
perinuclear aggregates

The ability of synthetic polypeptides containing
extended polyQ regions to form aggregates was tested in
cells such as COS-1, COS-7, NIH 3T3, 293, EcR-293, eHela,
WT-2, and PC-12. Cells were grown on cover slips to 50%
confluence and lipofected for two hours with Transfectam
reagent (Promega, Madison, WI) and plasmids encoding the
c-myc-tagged or EGFP-tagged polyglutamines of various lengths.
Polyglutamine aggregation was assayed from 16 to 72 hours
after transfection. Cells were fixed in 2% formaldehyde/0.1%
Triton X-100 for 10 minutes and incubated with primary mouse
monoclonal anti-c-myc (Invitrogen, Carlsbad, CA) antibody
(1:500) and secondary FluoroLink Cy3™ (Amersham Life
Science) antibody (1:2000). Nuclei were stained with
4',6-diamidino-2-phenylindole (DAPI). Epifluorescent microscopy
was performed on a Zeiss Axioplan II™ equipped with a
Quantix CCD™ camera (Photometrics, Tucson, AZ) and IPLab
Spectrum™ imaging software (Scanalytics, Fairfax, VA).

Normal length (25Q) synthetic polyglutamines always
showed diffuse cytoplasmic expression by fluorescence
microscopy. In contrast, extended polyglutamines (104Q,
191Q, 230Q, 300Q) were found to aggregate in the mammalian
cell lines tested. Early in the time course of
precipitation, extended polyQ formed small, star-like
aggregates, which were detected in COS-1 cells as early as
16 hours after transfection. Within 36 hours after
transfection, polyQ aggregates grew into dense, brilliantly
fluorescent spherical structures, which could be as large as
4 to 5 microns. Polyglutamine aggregates were found to be
located exclusively in the cytoplasm, often in the
perinuclear space or associated with the nuclear membrane.
While no significant difference in number or size of
aggregates using either the c-myc or the EGFP-tagged
extended polyQ constructs was observed, the polyQ/c-myc
aggregates, which were detected by means of fluorescent
antibody, were stained only as an intense peripheral
rim. Thus, polyQ aggregates form a very dense structure
which is impenetrable to the antibody. In contrast, the
fluorescence of the intrinsically fluorescent polyQ/EGFP
aggregates typically came from inside the core of the
aggregate. Since denatured EGFP lacks fluorescence, this
observation suggested that polypeptides inside the aggregate
were at least partially in native form.

Example 3: [...] flanking sequence of extended polyQ

Polyglutamine aggregates have been found in HD brain
in dystrophic neurites in the cortex and white matter and as
nuclear inclusions in the striatum. Using the cell culture
described herein, extended polyQ aggregates were found in
the cytoplasm of transfected cells, as opposed to the
nucleus. Since the constructs have only a small N-terminal
fragment of huntingtin, this may mean that the
C-terminus of huntingtin includes a nuclear localisation
signal (NLS). Such a putative NLS would cause slow
accumulation of mutant huntingtin in the nucleus and
eventual formation of nuclear inclusions. To test the
effect of a strong NLS on the subcellular localisation of
aggregates, the nucleolin protein (650 amino acids), which
has strong nuclear and nucleolus localization signals, was
chosen.

Expression constructs (polyQ/nucleolin/EGFP) were
generated by inserting nucleolin cDNA between polyQ-encoding
and EGFP-encoding sequences as follows. A nucleolin cDNA
sequence was amplified by PCR and inserted between
polyglutamine (25Q, 104Q, 300Q)-encoding and EGFP-encoding
sequences in HD polyQ EGFP constructs. The first methionine
codon of nucleolin cDNA was changed to a lysine codon in
these polyQ/nucleolin/EGFP constructs.

When normal length poly25Q/nucleolin/EGFP and
extended polyQ/nucleolin/EGFP fusion proteins were expressed
in cells, it was found that all polypeptides were located in
the nucleus, and particularly in the nucleoli. Moreover,
the length of the fusion protein, which ranged up to
1200 amino acids, did not limit nuclear translocation and
aggregation. No fluorescent signal was seen in the
cytoplasm.

In another approach, the naturally occurring
stretch of 38 glutamines in the TATA-binding protein
(TBP) was extended to 104Q so as to target aggregate
formation to the nucleus and directly study the effects of
polyQ aggregation in the nucleus. Wild-type TBP cDNA was
amplified from genomic DNA extracted from eHela. To replace
the native 38Q homopolymeric stretch in the TBP sequence,
sequences encoding N-terminal and C-terminal fragments of
TBP (GenBank accession # M55654) were amplified by PCR with
primers introducing novel internal HindIII and PstI
restriction sites at nucleotide positions 412 and 521,
respectively. DNA fragments encoding 25Q, 42Q, 65Q and 104Q
were ligated with the sequences encoding N-terminal and
C-terminal TBP fragments, and subcloned into pcDNA 3.1.
Finally, the TBP proteins were tagged with c-myc at the
C-terminus.

As predicted, TBP/42Q, TBP/65Q and TBP/104Q formed
multiple aggregates, which were clearly located within the
nucleoplasm. The subcellular localisation of polyQ
aggregates was determined by polyQ flanking sequences.
Additionally, it was found that a sequence of at least
1000 amino acids can be translocated into the nucleus by
active transport and aggregated without cleavage.

In previous experiments, polyQ aggregates were found
exclusively in the cytoplasm, unless the extended polyQ
construct included nucleolin. To determine whether the
strong nuclear localization signal could also function in
trans, a construct encoding an extended
poly104Q/nucleolin/EGFP fusion protein was coexpressed with
a construct encoding extended polyQ/c-myc, which lacks an
NLS. In this experiment, the polyQ/c-myc fusion protein was
detected in heterogeneous aggregates in the nucleus. Nuclear
localization was strictly dependent on co-aggregation with
poly104Q/nucleolin/EGFP fusion protein. Despite the
presence of the strong nuclear localization signal in cis,
it was found that polyQ/nucleolin/EGFP also aggregated with
polyQ/c-myc in the cytoplasm of some cells, and was excluded
from the nucleus. Thus, subcellular localisation of






WO 01/23412 



aggregation depends in general upon the functional 
characteristics of the protein in which the polyQ is 
embedded. Nonetheless, strong intermolecular interactions 
mediated by polyQ domains can in some cases be sufficient to 
override the effects of such intrinsic localization signals.

Example 4: Certain polyglutamine-containing
proteins can co-aggregate with extended polyQ

To establish whether a normal length polyglutamine
polypeptide of 25 glutamine residues can interact with and
perhaps aggregate with extended polyglutamines, the
polypeptides were co-expressed in cells as follows. Normal
length poly25Q/EGFP and extended poly104Q/c-myc were
expressed alone or co-expressed. Normal length polyQ
polypeptides showed a diffuse pattern of expression when
expressed alone. Remarkably, these same normal length
polyglutamines were recruited into cellular aggregates when
they were coexpressed with extended polyglutamines. In
contrast, when EGFP lacking a polyQ segment was co-expressed
with extended polyQ/c-myc, EGFP fluorescence was not
detected in aggregates. Co-expression experiments using
poly25Q/nucleolin/EGFP and extended polyQ/c-myc yielded
fluorescent co-aggregates in nucleoli, whereas
EGFP/nucleolin co-expressed with extended polyQ/c-myc gave
cytoplasmic aggregates which had no EGFP signal. These
results demonstrate the strict polyglutamine-dependent
nature of the co-aggregation phenomenon. The results
further suggest that intermolecular interactions that occur
between mutant extended polyQ and normal cellular proteins
with significant glutamine stretches (below a threshold of
31) may play a role in the cellular pathology of the
polyglutamine neurodegenerative disorders. To investigate
this possibility in the cell culture system, the nuclear
transcriptional co-activator CREB-binding protein (CBP),
which, like several such proteins, has a glutamine-rich
C-terminus within which is a homopolymeric stretch of 19
glutamine residues, was tested.

To express full length CBP, CBP cDNA
(GenBank accession # U47741) was cloned into the expression
vector pcDNA 3.1. The sequence encoding the
polyglutamine-rich domain near the C-terminus of CBP was
removed by digesting with SacII and XbaI, amplifying the
3'-end fragment with primers containing SacII and XbaI sites,
then fusing with the 5'-end fragment. This results in
approximately a 200 amino acid deletion (6652-7228 nt) at
the C-terminus which removes a polyglutamine-rich fragment,
including a 19Q homopolymeric stretch. Both full length and
deletion proteins were tagged with c-myc at the C-terminus.

Co-expression of the 25Q/EGFP construct with the
construct encoding c-myc-tagged, full-length CBP showed a
diffuse cytoplasmic localization. In contrast, when
c-myc-tagged, full-length CBP was coexpressed along with
extended polyQ constructs, CBP was detected within
cytoplasmic aggregates. As predicted, CBP which was
co-expressed with 104Q/TBP forms aggregates in the nucleus.
In sharp contrast, however, when CBP was co-expressed with
104Q/nucleolin, no CBP was detected in the nucleolar
aggregates. This failure to detect CBP in the nucleolar
aggregates can be explained by the subcellular and
subnuclear location of CBP itself.
While transfected CBP is seen strongly in the nucleus and
weakly in the cytoplasm, it is clearly excluded from
nucleolar bodies. These results suggest that the likelihood
that an endogenous cellular polyQ-containing protein will be
found in a polyQ aggregate may depend on the interplay
between a number of factors including local concentration
within the cell. In particular, the ability of mutant
extended polyQ to recruit cellular proteins into aggregates
may be exquisitely dependent on co-localization within
precise subcellular compartments. To determine whether
co-aggregation of CBP and 104Q/EGFP is mediated by the
homopolymeric polyQ domain, a CBP construct which encodes a
c-myc-tagged CBP lacking the glutamine-rich region was
co-transfected with a construct encoding 104Q/EGFP. No c-myc
signal was detected in the aggregates in the majority of
co-transfected cells. Thus, it was clearly demonstrated that
recruitment of cellular proteins into polyQ aggregates is
dependent upon interactions between polyQ domains.

Example 5: PolyQ aggregates shield entrapped
polypeptides and protect them from denaturation, even under
harsh conditions

In order to investigate further the nature of the
very insoluble aggregates, a novel in situ assay to test the
resistance of cellular aggregates to high concentrations of
detergents was developed. Cells were transfected with a
construct encoding either normal length poly25Q/EGFP, or
extended poly104Q/EGFP. Forty hours later, cells were
treated in situ with SDS/Triton X-100, at various
concentrations as high as 5% SDS/5% Triton X-100, overnight
at room temperature. High intensity EGFP fluorescence was
detected in surviving aggregates formed by extended
poly104Q/EGFP polypeptides after treatment with detergents.
In contrast, soluble, non-aggregated EGFP and poly25Q/EGFP
were completely denatured by detergents, and EGFP
fluorescence was no longer detected. The results show that
high concentrations of detergents are unable to destroy
polyQ aggregates or to denature the native EGFP structure
once it is bound in an aggregate.

Example 6: Extended polyQ peptides trap soluble, normal
length polyQ peptides into insoluble, detergent-resistant
aggregates in cells

To demonstrate directly that extended polyQ peptides
are able to trap normal length polyQ peptides into insoluble
aggregates in the cell, and are not simply loosely
associated or co-localized, co-aggregates were treated with
high concentrations of detergents as described above. Cells
were co-transfected with constructs encoding poly25Q/EGFP
and extended poly104Q/c-myc, producing co-aggregates. These
co-aggregates were extracted from the cells and were shown
to be resistant to high concentrations of detergents. The
fluorescence due to protection by the co-aggregation of
native normal length poly25Q/EGFP with extended polyQ was
identical to the fluorescence in aggregates formed by
extended poly104Q/EGFP alone. In control experiments, cells
were co-transfected with constructs encoding EGFP (lacking a
polyQ segment) and extended polyQ/c-myc. Treatment with
detergent denatured both soluble poly25Q/EGFP and EGFP
lacking a polyQ segment, such that EGFP fluorescence was no
longer detected. Thus, detergent-resistant insolubility was
shown to be dependent upon interactions between the polyQ
stretches themselves.

Example 7: System for high throughput screening for agents
that suppress polyglutamine aggregation

The above results indicate that aggregated
polyglutamines can form strong interactions. Molecules that
might interact with extended polyglutamines and thereby
suppress aggregation would be of potential therapeutic
benefit. Identification of such molecules in a cell culture
system would require the reliable induction of extended
polyglutamine aggregates. As a first step toward this goal,
ecdysone-inducible mammalian cell lines that expressed
EGFP-tagged 25Q, 104Q and 300Q in a ligand-dependent manner
were generated as follows.

EGFP and polyQ/EGFP fusions with 25Q, 104Q and 300Q
were subcloned into pIND DNA vector (Invitrogen, Carlsbad,
CA). EcR-293 cells (Invitrogen, Carlsbad, CA) were
transfected with pIND DNA using Transfectam™ reagent
(Promega, Madison, WI). Stable integrants were selected in
0.4 mg/ml G418 and 0.4 mg/ml Zeocin. PolyQ expression and
aggregation were tested in isolated cell lines by induction
with 0-20 µM Muristerone A and Ponasterone A. Cells from
selected clones showed uniform fluorescence after induction.
The expression of polypeptides containing an extended
polyglutamine region and the number of aggregates formed
were dose-dependent. Aggregates were detected as early as
24 hr after induction, with maximum appearance at 48-72 hr.
Typically, 2-3 aggregates per colony of 12-16 cells were
observed 48 hours after induction with 10 µM Muristerone A.

Transfected cells, induced with Muristerone, were
harvested from Petri dishes, washed twice with PBS, lysed
with 0.3% NP-40, and washed with 0.1% Triton X-100/PBS.
Simultaneously, floating aggregates from dead cells were
pelleted from culture media by high-speed centrifugation,
washed with PBS and 0.1% Triton X-100/PBS and combined with
cell lysates. Aggregates were washed twice with 1% Triton
X-100/PBS for 1 hour (or overnight) and washed in 0.1% SDS.
Aggregates were pelleted after each wash by centrifugation.
Finally, semi-purified aggregates were incubated with 2-5%
SDS/2-5% Triton X-100 mix for 1-48 hours at room temperature
(37°C). Alternatively, live cells expressing polyQ/EGFP
fusions were lysed in situ with 2-5% SDS/2-5% Triton X-100.

When treated in situ with 5% SDS/5% Triton X-100,
EGFP fluorescence was protected in polyQ aggregates even
after 48 hours of detergent treatment at 37°C. In contrast,
soluble, non-aggregated, EGFP-tagged material was denatured
instantly and completely, and fluorescence no longer
detected. This fluorescence quenching means to assess
solubility is simple to generate, highly reproducible, and
straightforward to score, due to the very intense
fluorescent signal generated by the aggregated
polyglutamines compared to the total absence of fluorescence
seen with soluble molecules treated with denaturant.
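Scoring such a readout across many samples can be sketched as follows. This Python fragment is a hypothetical illustration, not a procedure described in the text: the well labels, the use of replicate control wells, and the rule of calling a hit when fluorescence falls more than three standard deviations below the control mean are all assumptions made for the example.

```python
from statistics import mean, stdev


def call_hits(control_wells, test_wells, n_sd=3.0):
    """Return sorted labels of wells with reduced detergent-resistant signal.

    control_wells -- fluorescence readings from induced, untreated wells
    test_wells    -- mapping of well label to fluorescence reading
    n_sd          -- standard deviations below the control mean that
                     define the cutoff (hypothetical choice)
    """
    if len(control_wells) < 2:
        raise ValueError("need at least two control wells")
    cutoff = mean(control_wells) - n_sd * stdev(control_wells)
    return sorted(label for label, value in test_wells.items() if value < cutoff)
```

Because aggregated material stays brightly fluorescent and soluble material goes dark after detergent treatment, a simple cutoff of this kind may be adequate for a first pass. Any flagged compound would still need confirmation, for example to exclude compounds that merely suppress expression or quench EGFP directly.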

What is claimed is:

CLAIMS

1. A method of identifying a compound which
disrupts polypeptide aggregation, the method comprising:
providing a first polypeptide labelled with a detection
moiety which is inactivated in the presence of a denaturant;
providing a second polypeptide, wherein the first
polypeptide and the second polypeptide aggregate upon
contact;
contacting the first polypeptide, the second
polypeptide and a test compound to form a mixture;
contacting the mixture with the denaturant; and
determining activity of the detection moiety,
wherein a decrease in said activity following contact of the
mixture with the denaturant indicates that the test compound
is a polypeptide aggregation disrupting compound.

2. The method of claim 1, wherein the first and
second polypeptides are selected from the group consisting
of polypeptides containing an extended polyglutamine region,
beta-amyloid polypeptides, tau proteins, presenilins,
alpha-synucleins and prion proteins.

3. The method of claim 2, wherein the first and
second polypeptides are polypeptides containing extended
polyglutamine regions selected from the group consisting of
huntingtin, atrophin-1, ataxin-1, ataxin-2, ataxin-3,
ataxin-7, alpha 1A, and androgen receptor.

4. The method of claim 1, wherein the first and
second polypeptides are non-naturally occurring polypeptides
comprising at least 32 consecutive glutamine residues.

5. The method of claim 1, wherein the first
polypeptide is labeled with a fluorescent protein.



6. The method of claim 4, wherein the first
polypeptide is labeled with a fluorescent protein.

7. The method of claim 1, wherein the first
polypeptide or second polypeptide is immobilized.

8. The method of claim 1, wherein the first and
second polypeptides are in solution.

9. A method of identifying a compound which
disrupts the aggregation of polypeptides containing extended
polyglutamine regions, the method comprising:
providing a fluorescently labelled first
polypeptide, wherein the first polypeptide contains an
extended polyglutamine region;
providing a second polypeptide containing an
extended polyglutamine region;
contacting the first polypeptide, the second
polypeptide and a test compound to form a mixture;
denaturing unaggregated polypeptides in the mixture;
and
detecting fluorescence, wherein a decrease in
fluorescence in the presence of the test compound indicates
that the test compound is a polypeptide aggregation
disrupting compound.

10. The method of claim 9, wherein the first and
second polypeptides are non-naturally occurring polypeptides
comprising at least 32 consecutive Gln residues.

11. The method of claim 9, wherein the first and
second polypeptides are selected from the group consisting
of huntingtin, atrophin-1, ataxin-1, ataxin-2, ataxin-3,
ataxin-7, alpha 1A, and androgen receptor.

12. The method of claim 9, wherein the first or
second polypeptide is immobilized.

13. The method of claim 9, wherein the first and
second polypeptides are in solution.

14. A method of identifying a compound which
disrupts the aggregation of polypeptides containing extended
polyglutamine regions, comprising:
providing a cell which is genetically modified to
express a DNA encoding a heterologous polypeptide containing
an extended polyglutamine region;
contacting the cell with a test compound; and
determining whether the test compound decreases the
amount of aggregation of the polypeptide in the cell,
wherein a decrease in polypeptide aggregation in the
presence of the test compound indicates that the test
compound is a polypeptide aggregation disrupting compound.

15. The method of claim 14, wherein the
heterologous polypeptide is a fusion protein comprising a
label.

16. The method of claim 15, wherein the label is a
fluorescent protein or an enzyme.

17. The method of claim 15, wherein the label is a
green fluorescent protein or a blue fluorescent protein.

18. The method of claim 15, wherein the label is a
fluorescent protein and the determining step comprises:
contacting the cell with a denaturant; and
detecting fluorescence, wherein a decrease in
fluorescence in the cell contacted with the test compound
compared to a control cell, indicates that the test compound
is a polyglutamine polypeptide aggregation disrupting
compound.

19. The method of claim 14, wherein the expression
of the polypeptide is induced upon exposure of the cell to
an inducing agent.

20. The method of claim 19, wherein the inducing
agent is ecdysone or muristerone.

21. A method of identifying a compound which
disrupts the aggregation of polypeptides, the method
comprising:
providing a cell that is genetically modified to
express a DNA encoding a heterologous polypeptide, wherein
molecules of the polypeptide spontaneously aggregate within
the cell;
contacting the cell with a test compound; and
determining whether molecules of the polypeptide aggregate
in the presence of the test compound, wherein a decrease in
aggregation of the polypeptide molecules in the presence of
the test compound indicates that the test compound is a
polypeptide aggregation disrupting compound.

22. The method of claim 21, wherein the polypeptide
is a fusion protein comprising a label.

23. The method of claim 22, wherein the label is a
fluorescent protein or an enzyme.

24. The method of claim 22, wherein the label is a
green fluorescent protein or a blue fluorescent protein.

25. The method of claim 24, further comprising:
contacting the cell with a denaturant; and
detecting fluorescence, wherein the label is a green
fluorescent protein and a decrease in fluorescence in the
cell contacted with the test compound compared to a control
cell, indicates that the compound is a polypeptide
aggregation disrupting compound.

26. A DNA encoding a fusion protein comprising (a)
at least 32 contiguous glutamine residues and (b) a label,
wherein the sequence encoding the at least 32 glutamine
residues comprises both CAG codons and CAA codons.

27. The DNA of claim 26, wherein the CAG codons and
CAA codons alternate.

28. The DNA of claim 26, comprising the sequence
(CAA CAG CAG CAA CAG CAA)n (SEQ ID NO:1).

29. The DNA of claim 26, wherein the label is a
fluorescent protein or an enzyme.

30. The DNA of claim 26, wherein the label is a
green fluorescent protein or a blue fluorescent protein.

31. A fusion polypeptide comprising (a) at least 32
contiguous glutamine residues and (b) a fluorescent protein.

32. The fusion polypeptide of claim 31, wherein the
fluorescent protein is a green fluorescent protein or a blue
fluorescent protein.

33. An expression plasmid comprising the DNA of
claim 26, operably linked to an expression control sequence.

34. A cultured, genetically modified cell which
expresses the DNA of claim 26.

35. A method of producing a fusion protein,
comprising culturing the cell of claim 34 under conditions
appropriate for expressing the DNA encoding the fusion
protein.
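
As an illustration of the mixed-codon polyglutamine coding sequence recited in claims 26 to 28, the sketch below builds such a sequence and checks its properties. It is offered only as an example under stated assumptions: the six-codon repeat unit is read from the sequence recited in claim 28 as CAA CAG CAG CAA CAG CAA, and the function names and the choice of six repeats are hypothetical.

```python
# Repeat unit as read from claim 28 (an assumption; see the lead-in).
UNIT = ("CAA", "CAG", "CAG", "CAA", "CAG", "CAA")

# Both codons encode glutamine in the standard genetic code.
GLN_CODONS = {"CAA", "CAG"}


def polyq_coding_sequence(n_repeats: int) -> str:
    """Return n_repeats copies of the six-codon unit as one DNA string."""
    if n_repeats < 1:
        raise ValueError("n_repeats must be at least 1")
    return "".join(UNIT) * n_repeats


def count_glutamines(dna: str) -> int:
    """Count encoded glutamines, rejecting any non-glutamine codon."""
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    codons = [dna[i:i + 3] for i in range(0, len(dna), 3)]
    if any(codon not in GLN_CODONS for codon in codons):
        raise ValueError("sequence contains a non-glutamine codon")
    return len(codons)


def longest_cag_run(dna: str) -> int:
    """Length, in codons, of the longest uninterrupted run of CAG."""
    longest = current = 0
    for i in range(0, len(dna), 3):
        current = current + 1 if dna[i:i + 3] == "CAG" else 0
        longest = max(longest, current)
    return longest
```

Six copies of the unit encode 36 contiguous glutamines, which satisfies the "at least 32" limitation, while no run of pure CAG codons exceeds two.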